Abstract

Background: The identification of the loci and specific alleles underlying variation in quantitative traits is an 
important goal for evolutionary biologists and breeders. Despite major advancements in genomics technology, 
moving from QTL to causal alleles remains a major challenge in genetics research. Near-isogenic lines (NILs) are the ideal 
raw material for QTL validation, refinement of QTL location and, ultimately, gene discovery. 

Results: In this study, a population of 75 Arabidopsis thaliana near-isogenic lines was developed from an existing 
recombinant inbred line (RIL) population derived from a cross between physiologically divergent accessions Kas-1 
and Tsu-1. First, a novel algorithm was developed to utilize genome-wide marker data in selecting RILs fully 
isogenic to Kas-1 for a single chromosome. Seven such RILs were used in 2 generations of crossing to Tsu-1 to 
create BC1 seed. BC1 plants were genotyped with SSR markers so that lines could be selected that carried Kas-1 
introgressions, resulting in a population carrying chromosomal introgressions spanning the genome. BC1 lines were 
genotyped with 48 genome-wide SSRs to identify lines with a targeted Kas-1 introgression and the fewest genomic 
introgressions elsewhere. Seventy-five such lines were selected and genotyped at an additional 41 SNP loci and another 930 
tags using 2b-RAD genotyping by sequencing. The final population carried an average of 1.35 homozygous and 
2.49 heterozygous introgressions per line with average introgression sizes of 5.32 and 5.16 Mb, respectively. In a 
simple case study, we demonstrate the advantage of maintaining heterozygotes in our library whereby 
fine-mapping efforts are conducted simply by self-pollination. Crossovers in the heterozygous interval during this 
single selfing generation break the introgression into smaller, homozygous fragments (sub-NILs). Additionally, we 
utilize a homozygous NIL for validation of a QTL underlying stomatal conductance, a low heritability trait. 

Conclusions: The present results introduce a new and valuable resource to the Brassicaceae research community 
that enables rapid fine-mapping of candidate loci in parallel with QTL validation. These attributes along with dense 
marker coverage and genome-wide chromosomal introgressions make this population an ideal starting point for 
discovery of genes underlying important complex traits of agricultural and ecological significance. 

Keywords: 2b-RAD, Fine-mapping, Quantitative trait loci, Stomatal conductance 



Background 

Linkage mapping of QTL is a common statistical approach in 
plant genetics where recombinant populations generated 
from crosses between inbred parent lines are used, in combi- 
nation with molecular markers, to identify loci associated with 
variation in continuously distributed traits [1-8]. Mapping 
populations common to QTL analyses are many and include 
doubled haploids (DH), F2, backcross, advanced intercross, 
nested association mapping and RILs. Mapping QTL for 
complex traits is now routine, with the typical output being 
QTL spanning large confidence intervals encompassing many 
(hundreds or more) possible causal genes [9]. 

The steps following QTL identification frequently involve 
functional validation of the QTL, and refinement of loca- 
tion (fine-mapping) towards the goal of identification of a 
causal gene - the major challenge in quantitative genetics 
today [10]. One of the most common approaches for 
accomplishing these objectives is through the development 
and phenotypic characterization of near-isogenic lines (NILs) [11]. The
generation and phenotyping of NILs is considered a laborious
and time-consuming process, but the robust design leads to
a minimal false positive rate.

NILs are lines containing a single or small number of gen- 
omic introgressions from a donor parent in a different and 
otherwise homogeneous genomic background. By hom- 
ogenizing all genetic factors outside of the focal genomic 
region, the true effect of the QTL on the phenotype can 
be estimated relative to the line into which the introgres- 
sion was introduced (i.e. void of the chromosomal intro- 
gression) [12]. In addition to the simplification of genetic 
analyses, NILs are considered genetically 'immortal' [13] 
which allows for replicated experiments across multiple 
environments resulting in more accurate estimates of effect 
size for complex traits. NILs have proven to be an effective 
resource for QTL validation and a logical starting point for 
the creation of fine-mapping populations [14-21]. 

Creation of a single near-isogenic line generally starts by 
crossing a line carrying the targeted QTL region to one of 
the parental lines of the population, thus creating a back- 
cross population. Genome-wide genotyping of the back- 
cross progeny is performed to identify recombination 
events allowing for selection of progeny which carry the 
target chromosomal introgression derived from the donor 
and recurrent parent genome elsewhere. Subsequent gen- 
erations of self-pollination (selfing) are normally required 
to achieve homozygosity of the introgressed region and 
the process can take several backcrossing cycles to pro- 
duce a NIL carrying an introgression of acceptable size 
and genomic location. An alternative approach has been 
the use of heterogeneous inbred families (HIFs) where 
NILs are selected from incompletely inbred lines which 
still harbour a small amount of heterozygosity at random 
intervals across the genome [22,23]. Analysis of a HIF 
population with molecular markers allows for the selection 
of lines heterozygous at a candidate genomic location, 
which in combination with further selfing and genotyping, 
enables selection of NILs derived from several heteroge- 
neous genetic backgrounds. Producing NILs with smaller 
introgressions requires greater effort. Large populations 
are needed to break up small chromosomal segments, and 
high-density genotyping is required to discover them. 

A NIL library is a family of near-isogenic lines where each 
line carries a different donor parent fragment and the 
population carries introgressions spanning the entire ge- 
nome [24]. A NIL library is an ideal starting point for QTL 
validation, especially in cases where the library is derived 
from parent lines for which an immortal recombinant 
population (i.e. RILs, DH, etc.) already exists. In this case, 
QTL identified via traditional linkage mapping experiments 
performed on the mapping population can be immediately 
tested by selecting NIL(s) representing the QTL introgres- 
sion and testing them for a phenotypic effect relative to the 
wild type recurrent parent. NIL libraries are also valuable 
starting material for fine-mapping QTL through the
creation of sub-NILs [25], recombinant lines in which the
original NIL introgression is broken into smaller genomic 
fragments. In this case, a candidate NIL is backcrossed to 
the recurrent parent and the progeny are genotyped using 
markers specific to the introgression region so that indivi- 
duals carrying genomic fragments spanning the length of 
the original introgression can be identified. Subsequent 
phenotyping of the sub-NILs provides finer resolution of 
the region controlling the trait of interest, effectively 
narrowing the list of possible causal genes. 

Several NIL populations are currently available to the 
Arabidopsis research community. Koumproglou et al. 
[26], using 31 simple sequence repeat (SSR) markers, 
created a population of Chromosome Substitution Strains 
by replacing chromosomes from the accession Columbia 
(Col-0) with homologous chromosomes from the accessions 
Landsberg erecta (Ler) and Niederzenz (Nd). Additionally, 
a population of more traditional NILs was created in a 
systematic approach where increasing lengths of chromo- 
somal introgressions were introduced from Ler into the 
Col-0 background. Keurentjes et al. [27] generated a popu- 
lation of 92 NILs carrying genome-wide chromosomal 
introgressions from the accession Cape Verde Islands (Cvi) 
into the Ler background. Selections were made from the ge- 
notyped RIL mapping population described by Alonso- 
Blanco et al. [28] and used in backcrosses to create the 
NIL library. The RIL population has been mapped for 
QTL underlying flowering time and carbon isotope ratio 
(δ13C) [29], recombination frequency [30], seed germination
[31], seed mineral concentration [32] and fructose
sensitivity [33]. The same 321 AFLP (Amplified Fragment 
Length Polymorphism) markers used to build the RIL map 
were used in the NIL breeding scheme. Finally, Torjek et 
al. [34] created a population of 140 reciprocal NILs from 
the accessions Col-0 and C24 (78 NILs in the Col-0 back- 
ground and 62 lines in the C24 background) utilizing a 
total of 125 markers [35]. This NIL library has been used 
in subsequent studies of epistasis [36] and heterosis [37]. 

Here we report the development of a new population of 
75 NILs constituting genome-wide chromosomal introgres- 
sions. The NIL population exploited inbred lines selected 
from the RIL population described in McKay et al. [38] as 
the starting material for backcrossing. Briefly, the RIL 
population is derived from a cross between the A. thaliana 
ecotypes Tsu-1 (CS1640), an accession originating from 
Tsushima, Japan and Kas-1 (CS903), an accession origina- 
ting from Kashmir. These sites of collection are among 
the wettest and driest habitats, respectively, in the A. 
thaliana species range and the accessions differ in sev- 
eral aspects of drought physiology [39,40]. Recombinant 
populations derived from these diverse accessions will 
therefore segregate alleles underlying variation in these 
physiological traits, providing a powerful resource for 
identifying functional genes.

We developed a population of 75 Arabidopsis thaliana
NILs containing both homozygous and heterozygous intro- 
gressions, enabling simultaneous pursuit of QTL validation 
and fine-mapping. Genotyping the population with over 
1,000 molecular markers has provided us with excellent 
resolution on the total number of introgressions existing in 
each NIL as well as their location and length. It is the most 
densely genotyped NIL population developed thus far by 
more than 3-fold. The utility of the NIL library is demon- 
strated in a simple case study where, in a single generation, 
we utilize a homozygous NIL to validate and localize a 
QTL for a low-heritability physiological trait (g0; night-time
stomatal conductance) while concurrently selfing heterozy- 
gous selections to create sub-NILs for further fine-mapping. 

Results 

Marker-assisted NIL breeding program 

Figure 1 shows the breeding design for the NIL library. An 
algorithm was developed [see Additional file 1] to select
RILs homozygous for Kas-1 alleles across an entire chromosome, for each of the
5 Arabidopsis chromosomes. This search identified 7 such RILs
among the population of 346. These RILs were crossed to
Tsu-1 and progeny were genotyped to confirm they were
truly F1s. These were then crossed back to Tsu-1, creating
25 BC1 families. Plants from each BC1 family were geno- 
typed at the chromosome of interest to select individuals 
carrying Kas-1 alleles so they could be self-pollinated to ge- 
nerate BC1S1 seed. BC1S1 plants were genotyped using 48 
genome-wide SSRs described in McKay et al. [38]. These 
data were analyzed using an algorithm [see Additional file 2] 
designed to identify a subset of lines representing Kas-1 
chromosomal introgressions spanning the genome in other- 
wise Tsu-1 backgrounds. The algorithm was used to select 
103 BC1S1 plants which were screened at an additional 149 
single nucleotide polymorphism (SNP) loci using the
Sequenom MassARRAY® (Sequenom, San Diego, CA). Only
41 of 149 SNPs were informative for the parental lines. 
Finally, an additional 930 polymorphic loci were revealed 
using 2b-RAD [41] whereby genome complexity is reduced 
using class IIB restriction enzymes followed by sequencing on 
the SOLiD platform (Applied Biosystems, Foster City, CA). 
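
The first selection step can be illustrated with a minimal sketch. The authors' actual algorithm is in Additional file 1 and is not reproduced here; the function name, the single-letter allele codes ("K" for Kas-1, "T" for Tsu-1) and the toy data are assumptions made purely for illustration.

```python
from collections import defaultdict


def rils_isogenic_for_chromosome(genotypes, marker_chrom, donor="K"):
    """Map each chromosome to the RILs carrying only donor alleles on it.

    genotypes: dict of RIL name -> dict of marker name -> allele code
    marker_chrom: dict of marker name -> chromosome number
    (both layouts are hypothetical, chosen for this sketch)
    """
    # Group marker names by chromosome once.
    by_chrom = defaultdict(list)
    for marker, chrom in marker_chrom.items():
        by_chrom[chrom].append(marker)

    selected = defaultdict(list)
    for ril, calls in genotypes.items():
        for chrom, markers in by_chrom.items():
            # A missing call (None) disqualifies the RIL for that chromosome.
            if all(calls.get(m) == donor for m in markers):
                selected[chrom].append(ril)
    return dict(selected)


# Toy data: two chromosomes, three hypothetical RILs.
marker_chrom = {"m1": 1, "m2": 1, "m3": 2, "m4": 2}
genotypes = {
    "RIL_a": {"m1": "K", "m2": "K", "m3": "T", "m4": "K"},
    "RIL_b": {"m1": "K", "m2": "T", "m3": "K", "m4": "K"},
    "RIL_c": {"m1": "T", "m2": "T", "m3": "T", "m4": None},
}
print(rils_isogenic_for_chromosome(genotypes, marker_chrom))
```

In this toy case RIL_a qualifies for chromosome 1 and RIL_b for chromosome 2, while RIL_c is rejected everywhere.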

Figure 1 Breeding scheme of the NIL library. Breeding scheme and graphical genotypes of a set of NILs containing both homozygous 
introgressions (Chromosome 1) and heterozygous introgressions (Chromosome 3) derived from a single RIL. Each diploid breeding line is 
represented by a single row of 5 chromosomes where red coloring represents Kas-1 genotypes; Blue, Tsu-1; Green, heterozygous. Graphical 
genotypes of 6 of the 75 lines are shown. 

Polymorphisms detected between Tsu-1 and Kas-1 by 
2b-RAD genotyping 

Restriction site-associated DNA (RAD) tag sequencing 
reduces genome complexity by focusing only on DNA 
flanking the recognition sites of the selected restriction 
endonuclease [42]. The RAD method used in this study, 
described in [41] is a simple and effective means of dis- 
covering a large number of SNPs unique to the study 
population, avoiding the ascertainment bias associated 
with SNPs discovered via population surveys [43]. The 
2b-RAD method utilizes the type IIB restriction enzyme, 
AlfI, which operates by cleaving DNA both upstream 
and downstream of the recognition site. The resulting 
tags are uniform in length, making them ideal for ampli- 
fication and sequencing on next-generation platforms. 
Following digestion, tags were labelled with sample- 
specific oligonucleotide barcodes for multiplexed se- 
quencing. Finally, reads were quality filtered and aligned 
to a collection of AlfI sites in the Col-0 Arabidopsis reference
genome (TAIR9) in order to assign a physical
location to each SNP. 
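
As a rough illustration of why type IIB digestion yields tags of uniform length, the sketch below scans a sequence for a recognition pattern and cuts a fixed distance on either side of it. The pattern GCA(N6)TGC and the 12-base flank are assumptions about AlfI that should be checked against the enzyme supplier's documentation; the function name and the test sequence are hypothetical, and this is not the pipeline described in [41].

```python
import re

# Assumed recognition pattern: GCA, six arbitrary bases, TGC.
# The pattern equals its own reverse complement, so one strand is enough.
# The lookahead lets overlapping sites be reported as well.
SITE = re.compile(r"(?=(GCA[ACGT]{6}TGC))")
FLANK = 12  # assumed number of bases retained on each side of the site


def extract_tags(seq, flank=FLANK):
    """Return (start, tag) for each site that has complete flanks."""
    seq = seq.upper()
    tags = []
    for match in SITE.finditer(seq):
        start = match.start(1) - flank
        end = match.end(1) + flank
        if start >= 0 and end <= len(seq):
            tags.append((start, seq[start:end]))
    return tags


# Hypothetical sequence with one complete site and one truncated site.
toy = "A" * 12 + "GCAAAAAAATGC" + "C" * 12 + "GCATT"
for start, tag in extract_tags(toy):
    print(start, len(tag), tag)
```

Every tag returned has the same length (site plus two flanks), which is the property that makes such fragments convenient for amplification and sequencing.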

Initially, 1319 polymorphisms were identified between 
the parent lines Tsu-1 and Kas-1 based on the 2b-RAD 
tags that were sequenced. Because these NILs are de- 
rived from a known pedigree of previously genotyped in- 
dividuals, we were able to filter to include SNPs that 
would segregate in the progeny, resulting in a final set of 
930 loci with high-confidence genotypes for use in sub- 
sequent population analyses. A non-trivial fraction of 
markers remained as missing data in each sample due to 
the stringent scoring criteria of our method. The majo- 
rity of uncalled loci in typical 2b-RAD datasets are 
discarded because of low coverage (Meyer, unpublished 
observations) so this problem could be mitigated with 
deeper sequencing. However, the known pedigree of 
these samples and the low level of recombination made 
it possible to accurately reconstruct haplotypes despite 
these missing data. The filtered data were used to con- 
struct graphical genotypes [44] of the NIL population, 
a subset of which are represented in Figure 2. We 
also provide a database of the genotypes for the entire 
NIL population [see Additional file 3]. In addition, both 
parental accessions have been re-sequenced and the 
genome-wide reads have been deposited in the Short 
Read Archive (http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi)
and posted on the 1001 Genomes Project website
(www.1001genomes.org/) so the details of the 930
SNPs utilized in this study can be accessed at these 
resources. 

Genomics of chromosomal introgressions in the NIL 
population and the added value of increased 
marker resolution 

Across the 75 NILs, the average number of homozygous 
introgressions per NIL was 1.35 and ranged from 0 to 4 
while the average number of heterozygous introgressions 
was 2.49 and ranged from 0 to 6 (Figure 3). The average 
number of introgressions per chromosome was 57.6,
ranging from 34 on chromosome 2 to 79 on chromosome 1
[see Additional file 4]. The total length of homozygous
introgressions was 506 Mb compared to nearly
949 Mb of heterozygous chromosomal introgression,
which represent 4.3 and 8.0 times the total length of the
Arabidopsis genome, respectively. Together these results
suggest we have reached our goal: the entire genome is
represented as a Kas-1 introgression for each genotypic 
state (i.e. zygosity) in at least one NIL, thus enabling 
QTL validation and fine-mapping for any locus of 
interest. 
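
A total introgression length several times the genome size does not by itself show that every position is covered, so a direct check is useful. The sketch below pools introgression intervals from all NILs for one chromosome and one zygosity class and reports any gaps. The function name, the (start, end) tuple layout and the toy coordinates are assumptions, not the authors' code.

```python
def uncovered_regions(introgressions, chrom_length):
    """Return gaps on one chromosome not covered by any introgression.

    introgressions: iterable of (start, end) positions in bp, pooled over NILs
    chrom_length: chromosome length in bp
    """
    gaps = []
    covered_to = 0  # right edge of the region known to be covered so far
    for start, end in sorted(introgressions):
        if start > covered_to:
            gaps.append((covered_to, start))
        covered_to = max(covered_to, end)
    if covered_to < chrom_length:
        gaps.append((covered_to, chrom_length))
    return gaps


# Toy chromosome of length 100 with three hypothetical introgressions.
print(uncovered_regions([(0, 40), (30, 70), (80, 100)], 100))
```

An empty list for every chromosome, computed separately for homozygous and heterozygous introgressions, would correspond to the full-coverage claim made above.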

The additional loci accounted for by 2b-RAD geno- 
typing resulted in a final marker density of 2.24 markers 
per cM, based on the estimated 450.8 cM map of the 
Kas-1 x Tsu-1 RIL population. This is a significant im- 
provement in resolution from the 0.18 markers per cM 
when using only the original SSR and Sequenom mar- 
ker set (hereafter referred to as the coarse map). In spite 
of the high frequency of uncalled alleles, 128 new in- 
trogressions were revealed which summed to nearly 
539 Mb of DNA (164 Mb homozygous and 375 Mb he- 
terozygous) that would have been missed without the 
additional markers from 2b-RAD genotyping. To illus- 
trate this effect we re-sampled the dataset at varying 
marker densities (Figure 4). The exponential curve fit 
(r² = 0.99) used to estimate introgression detection begins
to level off above 800 markers, suggesting diminishing
introgression discovery with more extensive genotyping. 
In a comparison of the NILs using the coarse map 
relative to the dense map created from 930 2b-RAD 
markers, the average size of a homozygous introgression 
in the coarse map was 18% larger (1.2 Mb) than in the 
dense map. Similarly, the average heterozygous intro- 
gression size in the coarse map was 19% larger (1.3 Mb), 
confirming that the additional markers were identifying 
smaller introgressions missed in the coarse map. This 
fact is highlighted by the total number of introgressions 
[see Additional file 4] discovered using the denser 
marker set. The result was a 1.8-fold (Figure 3) increase 
in the number of homozygous and heterozygous intro- 
gressions discovered. 
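
The re-sampling idea can be sketched as follows: markers are drawn at random at a given density, and the number of introgressions still visible is counted. This is a simplified illustration under stated assumptions (single-letter genotype codes, homozygous and heterozygous donor calls pooled into one run, hypothetical function names), not the procedure used to produce Figure 4.

```python
import random


def count_introgressions(calls, recurrent="T"):
    """Count runs of consecutive non-recurrent calls along one chromosome.

    Missing calls (None) are skipped, so they neither start nor end a run.
    """
    runs = 0
    in_run = False
    for call in calls:
        if call is None:
            continue
        if call != recurrent and not in_run:
            runs += 1
        in_run = call != recurrent
    return runs


def detected_at_density(calls, n_markers, reps=10, seed=0):
    """Mean introgression count over random marker subsets of size n_markers."""
    rng = random.Random(seed)
    total = 0
    for _ in range(reps):
        keep = sorted(rng.sample(range(len(calls)), n_markers))
        total += count_introgressions([calls[i] for i in keep])
    return total / reps


# Hypothetical chromosome: 15 ordered markers carrying three introgressions.
calls = list("TTKKTTTHTTTTKTT")
for n in (3, 6, 9, 12, 15):
    print(n, detected_at_density(calls, n))
```

With all markers retained the count equals the true number of runs; smaller subsets can only miss introgressions, which is why detection rises and then saturates as density increases.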

Case study: utilizing selections from the NIL library for 
QTL validation and sub-NIL development 

To demonstrate the value of this new resource, we ana- 
lyzed the RIL population [38] for QTL for night-time 
leaf conductance (g0). g0 is a low-heritability, quantitative
trait that is important for plant-water relations and
mineral nutrition. While the adaptive value of g0 has yet
to be fully understood, incomplete stomatal closure during
the night can lead to substantial transpirational
water loss [45]. Variation in this trait has been found
among and within species, and it correlates with some
daytime gas-exchange traits such as water-use efficiency
(the ratio of CO2 assimilation to transpiration) [46]. Estimates
of transpiration have been found to be particularly
sensitive to g0 [47], making it an interesting candidate
for studies on the physiology and genetics of plant
drought adaptation. In line with this, intraspecific variation
in observed g0 has been found to have the largest
effect on transpiration across a species' native habitat
(Bauerle, unpublished observations).

Figure 2 Graphical genotypes of NILs representing (A) homozygous and (B) heterozygous introgressions cumulatively spanning the
length of the genome. Red, Kas-1; Blue, Tsu-1; Green, heterozygous.

Significant variation in night-time conductance was 
observed among the RILs. We identified a single QTL 
for g0 on the top of chromosome 1 (Figure 5A), which
explained 9% of the variance in g0, and found the trait
to have relatively low broad-sense heritability (H² =
0.21) in this population. Lines carrying Kas-1 alleles at
the markers at the QTL had lower dark conductance,
consistent with the dry habitat of the Kas-1 parent [see
Additional file 5]. Additional loci were identified on
chromosomes 2 and 4 below the threshold of significance,
which may have had marginal effects on g0 [see
Additional file 6].

Figure 3 [...] introgression number, estimated using the coarse and dense maps.

To validate the QTL we selected two NILs homozygous
for a Kas-1 introgression spanning the QTL and
measured g0 relative to Tsu-1 with the expectation that
one or both would have a significantly lower g0 value.
NIL TK201_137_6 carries an introgression estimated to
span physical positions 505,086 to 5,273,972 on chromosome 1
and KT116_63_15 is estimated to carry a much
larger introgression between positions 2,040,091 and
19,225,223 (Figure 5B). KT116_63_15 also carried small
heterozygous regions at either end of the homozygous
introgression. Large and highly significant differences
were found between both NILs and Tsu-1 (Table 1),
providing strong evidence for the presence of the QTL
and providing a surprisingly high estimate of the relative
difference in g0 conferred by the two alleles when
compared to the results of the initial QTL experiment
[see Additional file 5]. The region between 5,273,972
and 19,225,223 can be effectively eliminated from consideration
for harbouring the causal locus since TK201_137_6
was significantly different from Tsu-1 and did
not carry Kas-1 DNA in this interval [24]. It is worth
noting that both NILs carried introgressions on chromosomes
other than the focal region of chromosome 1.
However, none of them were common between the NILs
and the difference in g0 values between KT116_63_15
and TK201_137_6 was non-significant, which suggests
these introgressions were not impacting our results
substantially.
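
The elimination argument reduces to interval arithmetic on the two introgressions. The sketch below uses the chromosome 1 coordinates quoted above; the helper names are hypothetical, and treating the overlap as a candidate interval assumes that a single locus shared by both NILs is responsible for the effect.

```python
def overlap(a, b):
    """Intersection of two closed intervals (start, end), or None if disjoint."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


def excluded(tested, carrier):
    """Parts of the interval `tested` that lie outside the interval `carrier`."""
    parts = []
    if tested[0] < carrier[0]:
        parts.append((tested[0], min(tested[1], carrier[0])))
    if tested[1] > carrier[1]:
        parts.append((max(tested[0], carrier[1]), tested[1]))
    return parts


# Introgression boundaries (bp) on chromosome 1, as given in the text.
tk201 = (505_086, 5_273_972)
kt116 = (2_040_091, 19_225_223)

# Segment of the large introgression that TK201_137_6 does not carry.
print(excluded(kt116, tk201))
# Segment carried by both NILs, and its length in Mb.
shared = overlap(tk201, kt116)
print(shared, round((shared[1] - shared[0]) / 1e6, 2))
```

The excluded segment matches the 5,273,972 to 19,225,223 region ruled out above, and the shared segment spans roughly 3.2 Mb.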

Nearly 1,500 genes are predicted to lie within the re- 
gion spanning physical positions 505,086 to 5,273,972 
of chromosome 1. We have assembled a list of candidate
genes based upon hits to gene ontology (GO)
terms relevant to stomatal conductance: abscisic acid
(ABA), stomata and water [see Additional file 7].

Figure 4 Heterozygous introgressions discovered at varying
marker densities. Estimates are based on 10 repetitions of markers
selected at random from the dense map for each marker density
(mean ± SD). The line is fitted based on an exponential rise function.
The data point marked by an arrow represents the final estimates
generated from the dense map. The plot of homozygous
introgression discovery was nearly identical and excluded to simplify
the figure.

Figure 5 QTL location and graphical genotypes of the NILs
used in the QTL validation. (A) Localization of the dark-conductance
QTL along chromosome 1; the dotted line indicates
the threshold LOD score. (B) Graphical genotypes scaled to
represent the genetic distance (cM) of the x-axis of panel A of
chromosome 1 for the NILs used in the QTL validation.

To illustrate the power of deriving sub-NILs from heterozygote
NILs, concurrent to the QTL validation experiment
we planted seeds derived from a line heterozygous in
the roughly 3 Mb g0 QTL interval (Figure 5). We selected
5 polymorphic loci from a panel of validated
SNPs described in [38] for genotyping a population of
286 BC1S3 individuals (BC1S1 graphical genotype is represented
in Figure 6). The marker representing the lower
end of the interval at physical position 6,839,609 did not
segregate and all individuals were homozygous for the
Tsu-1 allele. The genotype for the 2b-RAD allele near
this location was scored as "not genotyped" in the original
BC1S1 genotyping so we were unsure exactly
where this particular heterozygous introgression ended.
In the end, we were left with 4 informative markers in
the physical interval spanning positions 2,211,035 to
6,572,582. We selected 17 recombinants (Figure 6)
representing the majority of the recombination events
possible. Unfortunately, no double recombinants were
discovered, so a sub-NIL carrying Kas-1 alleles only
in the middle of the interval could not be recovered.
However, heterozygous individuals TK176_108_1_4_13
and TK176_108_1_4_38 were kept for selfing and will be
available for re-planting to accomplish this since a crossover
has already occurred at the lower end. Ultimately,
individuals were recovered in this single selfing generation
that could be used in the next generation for g0
phenotyping experiments to effectively narrow the QTL
interval down to, at most, the 1.9 Mb interval between
markers C1_2211035 and C1_4142402, an interval predicted
to carry about 590 genes.
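
Screening selfed progeny for recombinants amounts to finding adjacent markers whose genotype calls differ. The sketch below shows that step for one individual; the genotype codes, function name and example calls are assumptions, and only the three marker positions quoted in the text are used.

```python
def crossover_intervals(calls, positions):
    """Return the marker intervals in which the genotype call changes.

    calls: genotype codes ordered along the chromosome ("K", "T" or "H")
    positions: physical positions (bp) of the same markers, same order
    """
    breaks = []
    for i in range(len(calls) - 1):
        if calls[i] != calls[i + 1]:
            breaks.append((positions[i], positions[i + 1]))
    return breaks


# Marker positions (bp) on chromosome 1 quoted in the text.
positions = [2_211_035, 4_142_402, 6_572_582]

# Hypothetical sub-NIL: Kas-1 at the first marker, Tsu-1 at the others.
breaks = crossover_intervals(["K", "T", "T"], positions)
print(breaks)
print([end - start for start, end in breaks])  # interval widths in bp
```

A single break marks a simple recombinant; two breaks in one individual would indicate the double recombinant needed to isolate the middle of the interval.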

Discussion 

Maintenance of homozygous and heterozygous NILs 
facilitates simultaneous QTL validation and 
fine-mapping efforts 

Near-isogenic lines remain the ideal starting material for
validation of QTL as well as breeding schemes designed
for fine-mapping with the end goal being the identification
of candidate genes [48-51]. QTL validation is relatively
straightforward and consists simply of phenotyping
NILs with introgressions at the region of interest for the
trait of interest. Creation of a suitable population for
fine-mapping is not as straightforward and is normally a
three-generation process that starts with a cross between
an inbred NIL and the recurrent parent. This is typically
followed by a generation of self-pollination to allow for
recombination in the introgression region. The seed
harvested from these self-pollinated plants can then be
genotyped with markers specific to the region so that
homozygous sub-NILs can be identified. The process is
fairly straightforward and inexpensive in terms of
physical resources, but there is a time cost of at least 3
generations (equivalent to a minimum of 18 weeks).

Table 1 Results of QTL validation experiment comparing
NIL g0 values with Tsu-1

Comparison     g0 (mmol m-2 s-1)   Standard Error   Difference NIL - (Tsu-1)   t value
Tsu-1          119.76              11.55            n/a                        -0.94
TK201_137_6    52.42               10.99            -67.34 **                  -4.02
KT116_63_15    67.45               11.40            -52.30 **                  -3.15

Difference values with ** are highly significant (P < 0.0001).

Our case study illustrates the advantages of maintaining 
both homozygotes and heterozygotes in the NIL popula- 
tion, combining the benefits of traditional homozygous 
NILs with the advantages of HIFs [22,23]. For example, 
measuring g0 on the homozygous NILs provided strong 
evidence for the presence of the QTL in a single gene- 
ration, thus avoiding the process of generating homozy- 
gous lines that would be necessary in HIF populations. 
These results provided a better estimate of the QTL effect 
size relative to the results derived from our QTL mapping 
approach and have justified further investments in fine- 
mapping using heterozygous NILs. This emphasizes the 
power NILs create by isolating the genetic factors control- 
ling a phenotype to a single locus as there were other loci 
worthy of consideration as contributors to variation in g0 
in the RIL population. Analysis of the genes predicted to 
lie within this interval revealed a majority of them had GO 
annotations related to ABA, the major signalling molecule 
in stomatal regulation [52-54], but examination of the en- 
tire region with the AmiGO enrichment analysis tool [55] 
found it was not significantly enriched for ABA genes. Inspection
of the physical location of these ABA-associated
candidates reveals that they are clustered in a 1.2 Mb
interval (chromosome 1 physical interval: 712,473-1,894,148) which
represents a relatively small portion of the 4.8 Mb introgression
tested, thus providing an interesting focal region
during fine-mapping of the g0 phenotype.

With regards to fine-mapping, selfing a heterozygous 
NIL selection from the population yielded several sub- 
NILs suitable for phenotyping or additional genotyping 
in future generations, an attribute common with HIF 
populations and advantageous over traditional NILs. 
This was accomplished using a modest population size 
of BC1S3 plants (n = 286) and the interval could be 
narrowed down further through genotyping at a higher 
number of loci and increasing the population size [56]. 
Regardless, in a six-week period we have identified a 
population encompassing recombinants in the 4.8 Mb 
region identified as causal during the QTL validation 
experiment, translating to a 3-fold reduction in total time 
versus a breeding scheme utilizing inbred NILs. 

2b-RAD is an efficient method for dense genotyping of 
recombinant populations 

Arabidopsis thaliana recently celebrated its 25th anniversary
as a model organism and now stands alone as
the most thoroughly studied plant species on record
(http://www.arabidopsis.org/) [57]. Recent efforts are producing
comprehensive polymorphism databases (http://www.arabidopsis.org/,
http://signal.salk.edu/cgi-bin/AtSFP).
To interpret the significance and functional consequences
of this natural variation, we need to understand the multivariate
phenotypic consequences of these variants. NIL libraries,
mutants and complementation studies are the tools
required for this mechanistic understanding.

Figure 6 Diagram of genotype information for fine mapping lines. Heterozygous NIL selected for selfing (top) and the detailed focal region
of the selected sub-NILs (bottom). The numbers at the bottom of the sub-NILs indicate the physical position of markers and, therefore, represent
estimated introgression boundaries. Red, Kas-1; Blue, Tsu-1; Green, heterozygous.

The 2b-RAD method added 930 high-confidence
genotypes to our map, providing a level of
resolution not yet achieved in any of the Arabidopsis
NIL populations described to date. The value of these
additional markers is obvious as we compare the coarse
and dense maps. The discovery of an additional 129
introgressions is clearly important when making
selections for QTL validation. For instance, three additional
homozygous introgressions were discovered in
KT154_2_3, changing the estimate from one to four. This
is a clear illustration of the risks associated with utilizing 
NILs genotyped at low density in experiments aimed at 
QTL validation. These offsite introgressions may have ef- 
fects on the phenotype of interest, potentially resulting in 
erroneous or uncertain conclusions regarding the QTL ef- 
fect size and location. 

The Kas-1 x Tsu-1 RIL and NIL populations are a valuable 
resource for research on the genetics of drought 
adaptation in the Brassicaceae 

Substantial variation for several traits relevant to drought
adaptation has been observed in the Kas-1 x Tsu-1 RIL
population including δ13C, leaf water content,
instantaneous transpiration rate, flowering time, abscisic
acid content and root mass [38, unpublished results]. Accordingly,
the NIL population described herein is expected
to vary for the same traits, providing a powerful resource 
for moving from QTL, encompassing thousands of genes, 
discovered in the RIL population towards a smaller list of 
putative functional candidates. 

No other plant species has been more studied or
characterized than Arabidopsis thaliana [57]. A high degree
of sequence collinearity between it and members of the
agriculturally significant Brassica genus was discovered
over a decade ago [58]. Similar levels of synteny have
been found in comparisons with other taxa in the
Brassicaceae [59-62]. These results suggest that
translational genomics, that is, utilizing basic research findings
in model organisms to answer practical research
questions in species of higher economic value or importance
[63,64], could be a viable avenue in understanding
complex traits. In this regard, we suggest the Kas-1 x Tsu-1
populations as the ideal starting point for basic research
on the genetics and genomics of drought adaptation.

Conclusions 

We have developed a population of 75 NILs that
provides genetic resources for fine-mapping QTL as well as
QTL corroboration. The high marker density used to
construct the population provides a level of resolution
not yet seen in a NIL population, thus minimizing
ambiguity in fine-mapping and QTL validation studies caused
by unidentified chromosomal introgressions elsewhere
in the genome. The unique variation that exists between
the parents used to construct this resource provides a
valuable asset for research focused on identifying the
genes responsible for drought adaptation in Arabidopsis
and beyond.

Methods 

Plant material & growth conditions 

The A. thaliana accessions Kas-1 (CS903) and Tsu-1
(CS1640) were used as the original parent lines in
developing the RIL population of 346 lines. Kas-1 and Tsu-1
were chosen as parents for developing this population
because of their extreme differences in water use
efficiency as measured by δ¹³C [39,40]. RILs from this
population served as the starting point for the NIL
breeding program described below.

For the QTL experiment, seeds of the RILs and
the parents were sown on soil (Fafard 4P mix, Conrad
Fafard Inc., Agawam, MA) in 3-inch pots. Seeds were
planted in a randomized complete block design consisting
of 2 blocks, and then the pots were refrigerated at 4°C in
darkness for 5 d to cold-stratify the seeds prior to
commencement of an 8:16 h (light:dark) photoperiod in
Conviron ATC60 growth chambers (Controlled
Environments, Winnipeg, MB), at 23°C and 40% humidity
during the day and 20°C and 50% humidity during the dark
period. Light intensity was approximately 330 μmol m⁻² s⁻¹.
Plants were grown for approximately 6 weeks prior to
measurement. Stomatal conductance was measured in
darkness on non-senescing leaves that were large enough
to fully accommodate the leaf chamber (1 cm x 2 cm),
using an infrared gas analyser (model Li-Cor 6400, Li-Cor
Inc., Lincoln, NE). Prior to measurement the plants were
dark adapted for 20-28 h. A humidifier was used to
reduce variation in humidity over the course of the
measurements. For each leaf, 10 measurements were taken,
with an interval of 10 s between measurements.

For the QTL validation experiment, plants were grown
in a randomized complete block design consisting of 3
blocks where each genotype was replicated 6 times
within each block. Plants were grown under exactly the
same conditions as those described above except that
the photoperiod was increased to 12:12 h (light:dark) to
accommodate other experiments conducted in the same
chamber. One major difference between the two
experiments was the use of leaf porometers (model SC-1,
Decagon Devices, Inc., Pullman, WA) rather than an
infrared gas analyser for stomatal conductance estimates.
Two non-senescing leaves were measured on each plant
following the manufacturer's recommended protocol.

Genetic analyses 

Broad-sense heritability was estimated by calculating the
ratio V_G:V_P, where V_G is the among-RIL component of
variance and V_P is the total phenotypic variance. QTL
mapping was performed in the R/qtl program of the R
statistical package [65,66], using Haley-Knott regression.
Significance thresholds were determined using 1000
permutations. A penalized stepwise approach [67] was used
for selection of a multiple-QTL model.
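
The heritability ratio described above can be illustrated with a minimal Python sketch. It assumes a balanced design (the same number of replicate measurements per RIL) and estimates the variance components by the method of moments from a one-way ANOVA; the estimation procedure actually used for this study is not specified in the text, and the function name and data layout are hypothetical.

```python
from statistics import mean

def broad_sense_heritability(values_by_line):
    """Estimate H^2 = V_G / V_P from a balanced one-way layout.

    values_by_line maps a line name to its replicate measurements
    (equal replication per line is an assumption of this sketch).
    """
    groups = list(values_by_line.values())
    n_lines = len(groups)
    n_reps = len(groups[0])
    if n_lines < 2 or n_reps < 2:
        raise ValueError("need at least 2 lines and 2 replicates per line")
    if any(len(g) != n_reps for g in groups):
        raise ValueError("this sketch assumes a balanced design")
    grand = mean(v for g in groups for v in g)
    # Mean squares among lines and within lines.
    ms_among = n_reps * sum((mean(g) - grand) ** 2 for g in groups) / (n_lines - 1)
    ms_within = sum((v - mean(g)) ** 2 for g in groups for v in g) / (
        n_lines * (n_reps - 1)
    )
    # Method-of-moments variance components; V_G is truncated at zero.
    v_g = max((ms_among - ms_within) / n_reps, 0.0)
    v_p = v_g + ms_within  # total phenotypic variance = V_G + residual
    return v_g / v_p
```

Here V_P is taken as the sum of the among-line and residual components; block effects, which a full analysis of the randomized complete block design would include, are ignored for brevity.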

For the QTL validation experiment, data were
analyzed with a linear mixed model using PROC MIXED in
the SAS software package (SAS Institute Inc. 2003, Cary,
NC) where block, row and column effects were treated
as random.

Marker assisted NIL breeding program 

To start, 7 RILs were selected from the original
population of 346 using the code supplied in an additional file
[see Additional file 1]. These 7 represented lines
homozygous for Kas-1 alleles across one of each of the 5
chromosomes and all were crossed to Tsu-1 at least 10
times. Some attempted crosses may result in
self-pollination due to technical error, thus we genotyped
progeny to confirm they were F1s. In general, the real
F1s were several times larger than the midparent value,
so genotyping was almost unnecessary. Confirmed F1s
were crossed back to Tsu-1 and each fruit was collected
separately and considered a BC1 family, ultimately
creating 25 families. 24 plants from each family were
genotyped at the chromosome of interest and selected for
selfing to generate BC1S1 seed. In addition to culling
the occasional plant generated due to self-pollination, it
was also necessary to remove individuals sired by
(haploid) pollen from the F1 carrying Tsu-1 alleles for the
chromosome of interest. In the next generation, 690
BC1S1 plants were genotyped with the 48 genome-wide
SSRs described in [38]. These were then ranked using an
algorithm [see Additional file 2] to find lines that were
largely Tsu-1, but carrying Kas-1 introgressions spanning
the genome. In the end, 75 lines were selected, which we
screened at an additional 149 loci using the Sequenom
MassARRAY® platform, of which 41 were polymorphic.
930 polymorphic loci were added to this marker data set
via 2b-RAD [41], in which class IIB restriction enzymes are
used to reduce genome complexity, for a final total of
1011 genotyped.
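
The first selection step described above (finding RILs that are homozygous for Kas-1 along an entire chromosome) can be sketched in Python as follows. This is not the code supplied in Additional file 1; the genotype coding ('K', 'T', 'H', None) and the data layout are assumptions made for illustration.

```python
def rils_fixed_for_kas(genotypes, marker_chrom, chrom):
    """Return RIL names whose markers on `chrom` are all homozygous Kas-1.

    genotypes: dict of RIL name -> list of calls, one per marker, coded
        'K' (Kas-1), 'T' (Tsu-1), 'H' (heterozygous) or None (missing).
    marker_chrom: list giving the chromosome of each marker, same order.
    """
    idx = [i for i, c in enumerate(marker_chrom) if c == chrom]
    if not idx:
        raise ValueError(f"no markers on chromosome {chrom}")
    selected = []
    for name, calls in genotypes.items():
        # Missing calls are treated conservatively: they disqualify the line.
        if all(calls[i] == "K" for i in idx):
            selected.append(name)
    return sorted(selected)
```

Treating missing calls as disqualifying is a design choice of this sketch; a less strict rule would admit more candidate lines at the risk of undetected Tsu-1 segments.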

DNA extraction and genotyping 

Genomic DNA was isolated from lyophilized tissue
collected from approximately 4-week-old, chamber-grown
plants using the DNeasy Plant Mini Kit (Qiagen, Valencia,
CA) according to the manufacturer's instructions.

The 48 polymorphic microsatellites used in this study
were selected from the large number of those available
in A. thaliana [68,69, arabidopsis.org] due to easily
distinguishable allele calls. Descriptions of the primers,
PCR conditions and allele scoring are explained in [38].

DNA samples were used to prepare 2b-RAD libraries
as previously described [41]. A detailed protocol is
available at the Meyer laboratory website
(http://people.oregonstate.edu/~meyere/). Briefly, library
preparation for 2b-RAD genotyping began with digestion of gDNA
samples with AlfI (Fermentas) at 37°C for 3 h followed
by ligation of adaptors at 4°C for 16 h. Ligation products
were amplified by PCR and barcodes introduced to
gel-extracted products in a second PCR reaction. Finally,
libraries were pooled for multiplex sequencing on the
SOLiD sequencing platform (Applied Biosystems). Raw
sequences were processed to exclude low-quality reads,
and the remaining high-quality reads were aligned in
color-space using the SHRiMP software package [70] to AlfI sites
extracted from the Arabidopsis genome (TAIR9). A
custom Perl script was applied to eliminate short,
statistically weak and ambiguous alignments (reads matching
multiple sites equally well). Finally, genotypes were
determined from nucleotide frequencies using custom
Perl scripts to classify each locus as homozygous
(minor allele frequency [MAF] < 1%), heterozygous
(MAF > 25%), or undetermined (1% ≤ MAF ≤ 25%). 20x
coverage was required in the parental genomes to
identify these alleles with high confidence, and a relaxed
threshold of 10x in all other samples to maximize marker
densities. Each polymorphic locus identified in these
genotypes was compared with the parental genotypes
(Tsu-1 and Kas-1) to assign it to one of these
backgrounds, a comparison that would obviously not be
possible for any loci genotyped in one parent but not
the other as a result of variation in sequencing
coverage. To reduce the effects of such missing data, we
imported genotypes for Tsu-1 and Kas-1 from resequencing
data (McKay, unpublished results) for any loci genotyped
in one parent but not the other.
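
The genotype-calling thresholds above can be illustrated with a short Python sketch. It is not the custom Perl script used in this study. It assumes that MAF is computed as the read count of the second most common nucleotide divided by total depth, and the 'low_coverage' label is introduced here only for illustration.

```python
def classify_locus(allele_counts, min_coverage=10):
    """Classify a locus from read counts per nucleotide.

    allele_counts: dict of nucleotide -> read count at the locus.
    min_coverage: 10 for NIL samples, 20 for the parents (from the text).
    Returns 'homozygous', 'heterozygous', 'undetermined' or
    'low_coverage' (a label introduced for this sketch).
    """
    depth = sum(allele_counts.values())
    if depth < min_coverage:
        return "low_coverage"
    counts = sorted(allele_counts.values(), reverse=True)
    # Assumed MAF definition: second most common nucleotide over depth.
    minor = counts[1] if len(counts) > 1 else 0
    maf = minor / depth
    if maf < 0.01:
        return "homozygous"
    if maf > 0.25:
        return "heterozygous"
    return "undetermined"
```

For a parental sample, the stricter threshold would be applied by calling `classify_locus(counts, min_coverage=20)`.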

KASP SNP genotyping assays (LGC Genomics,
Teddington, Middlesex, UK) were used for sub-NIL
development. Primer sequences [see Additional file 8] were
designed using sequence data from TAIR10 [71] for
amplification of SNPs identified and validated on the
SNPlex genotyping system (Applied Biosystems) as
described in [38]. KASP is a novel allele-specific PCR assay
that utilizes a FRET (Fluorescence Resonance Energy
Transfer) system. In short, along with a common primer,
allele-specific primers are designed to include a unique
18 bp sequence at the 5' end. The unique sequences are
identical to a pair of oligonucleotides with 3' bound
quenchers for a complement pair of 5' fluorescently
labelled oligos inside the reaction mix. During PCR,
allele-specific amplification leads to the generated product(s)
outcompeting the quencher-containing oligos for
binding to the fluorescently labelled oligos, allowing for an
observable signal to be measured using a light reader.
The intensity of the signal(s) allows for a quantitative
measure of SNP copy number.

Estimating chromosomal introgression length and number

The physical length of introgressions in the final NIL
library was estimated using graphical genotypes [44].
Physical length estimates of introgressions flanked by SSR
markers were made using the location of the forward
primers; SNP locations were determined by their location
in the Col-0 reference genome. To avoid false positives, an
introgression was scored based on the presence of at least
3 consecutive markers with the Kas-1 genotype.
Introgression boundaries were then defined by three consecutive
markers with an alternative genotype. This helped avoid
over-estimating introgression numbers due to occasional
incorrect allele calls or differences in the location of loci in
this population relative to the Col-0 genome used as a
reference for mapping sequence reads. For the analysis
of introgression discovery at varying marker densities, an
Excel macro was written to sum the number of
heterozygous and homozygous introgressions discovered. The
loci included in replicated sampling were selected
randomly using Excel's RAND function.
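
The three-consecutive-marker rule described above can be sketched in Python as follows. This is one possible reading of the rule, not the Excel macro used in this study: an interval opens at the first of three consecutive target calls and closes at the last target call that precedes three consecutive alternative calls. The genotype coding and function name are assumptions.

```python
def call_introgressions(calls, positions, target="K", min_run=3):
    """Return (start_bp, end_bp) intervals for one chromosome.

    calls: marker genotypes in physical order, e.g. 'K', 'T', 'H'.
    positions: marker positions in bp, same order as `calls`.
    An interval opens at the first of `min_run` consecutive `target`
    calls and closes at the last `target` call before `min_run`
    consecutive non-target calls (or at the chromosome end).
    """
    intervals = []
    start = None      # index of the first marker of the open interval
    last_hit = None   # index of the most recent target call inside it
    run_hit = 0       # current run of consecutive target calls
    run_miss = 0      # current run of consecutive non-target calls
    for i, call in enumerate(calls):
        if call == target:
            run_hit += 1
            run_miss = 0
            if start is None and run_hit >= min_run:
                start = i - min_run + 1
            if start is not None:
                last_hit = i
        else:
            run_miss += 1
            run_hit = 0
            if start is not None and run_miss >= min_run:
                intervals.append((positions[start], positions[last_hit]))
                start = last_hit = None
    if start is not None:
        intervals.append((positions[start], positions[last_hit]))
    return intervals
```

Running the function once with `target="K"` and once with `target="H"` would give separate homozygous and heterozygous counts. Interval lengths computed from the outermost markers are lower bounds, since the true breakpoints lie somewhere between flanking markers.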

Candidate gene identification

The full list of genes expected to lie within the QTL 
interval spanning physical positions 505,086 to 5,273,972 
was assembled using TAIR10 [71]. GO annotations for 
the full gene list were downloaded using the Bulk Data 
Retrieval and Analysis tool on TAIR10 and searched 
using the terms abscisic acid (ABA), stomata and water. 
Gene enrichment analysis was performed using the GO 
enrichment analysis tool in AmiGO [55]. 
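
The keyword search of GO annotations described above can be sketched in Python. The input layout (pairs of gene identifier and GO term description) is an assumption about how a bulk annotation download might be parsed, and the function name is hypothetical.

```python
def genes_matching_terms(go_annotations,
                         terms=("abscisic acid", "stomata", "water")):
    """Return sorted gene IDs with any GO description containing a keyword.

    go_annotations: iterable of (gene_id, go_description) pairs
        (assumed layout; one pair per annotation row).
    terms: case-insensitive substrings to search for.
    """
    lowered = [t.lower() for t in terms]
    hits = set()
    for gene_id, description in go_annotations:
        text = description.lower()
        if any(t in text for t in lowered):
            hits.add(gene_id)
    return sorted(hits)
```

Substring matching means that "stomata" also matches descriptions such as "stomatal movement", which is usually the desired behaviour for a candidate screen.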